AITopics | output feature


output feature

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

33ebd5b07dc7e407752fe773eed20635-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 10:29:36 GMT

artificial intelligence, machine learning, regional feature, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)


How Control Information Influences Multilingual Text Image Generation and Editing?

Neural Information Processing Systems, Mar-18-2026, 01:12:26 GMT

Visual text generation has significantly advanced through diffusion models aimed at producing images with readable and realistic text. Recent works primarily use a ControlNet-based framework, employing standard font text images to control diffusion models. Recognizing the critical role of control information in generating high-quality text, we investigate its influence from three perspectives: input encoding, role at different stages, and output features. Our findings reveal that: 1) Input control information has unique characteristics compared to conventional inputs like Canny edges and depth maps.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.79)


Visualizing the Emergence of Intermediate Visual Patterns in DNNs: Supplementary Material

Neural Information Processing Systems, Feb-8-2026, 04:44:32 GMT

The visualization results revealed the semantic similarity between categories. Furthermore, Figure 2 shows the projected sample feature g at different iterations of training. Therefore, the probability density of f not only depends on its orientation but also its strength. In this way, {π, µ} were updated via the following E-step and the M-step. This section provides more discussions on the quantification of knowledge points.

artificial intelligence, machine learning, regional feature, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.30)


Towards Efficient Pre-Trained Language Model via Feature Correlation Distillation

Neural Information Processing Systems, Dec-24-2025, 13:07:07 GMT

Knowledge Distillation (KD) has emerged as a promising approach for compressing large Pre-trained Language Models (PLMs). The performance of KD depends on how effectively knowledge is formulated and transferred from the teacher model to the student model. Prior work mainly focuses on directly aligning output features from the transformer block, which may impose overly strict constraints on the student model's learning process and complicate training by introducing extra parameters and computational cost. Moreover, our analysis indicates that the different relations within self-attention, as adopted in other works, involve greater computational complexity and can easily be constrained by the number of heads, potentially leading to suboptimal solutions. To address these issues, we propose a novel approach that builds relationships directly from output features. Specifically, we introduce token-level and sequence-level relations concurrently to fully exploit the knowledge from the teacher model. Furthermore, we propose a correlation-based distillation loss to relax the exact-match requirement inherent in traditional KL divergence or MSE loss functions. Our method, dubbed FCD, is a simple yet effective way to compress various architectures (BERT, RoBERTa, and GPT) and model sizes (base-size and large-size). Extensive experimental results demonstrate that our distilled, smaller language models significantly surpass existing KD methods across various NLP tasks.
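
The idea of a correlation-based loss over token-level relations can be illustrated with a minimal sketch. The abstract does not give the exact FCD formulation, so everything below is an assumption made for illustration: relations are taken to be pairwise dot products between token features, and the loss is one minus the Pearson correlation between the teacher's and the student's relation matrices. The function names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def token_relations(features):
    """Flattened matrix of pairwise dot products between token feature vectors.

    Assumed relation type; the paper may define relations differently.
    """
    return [sum(a * b for a, b in zip(u, v)) for u in features for v in features]

def correlation_loss(teacher_feats, student_feats):
    """1 - Pearson correlation between teacher and student relation matrices."""
    return 1.0 - pearson(token_relations(teacher_feats),
                         token_relations(student_feats))

# Three tokens with 2-dimensional features (toy values).
teacher = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
scaled_student = [[3.0 * x for x in tok] for tok in teacher]
other_student = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

loss_scaled = correlation_loss(teacher, scaled_student)
loss_other = correlation_loss(teacher, other_student)
```

In this sketch a student whose features are a rescaled copy of the teacher's incurs essentially zero loss, whereas an MSE loss on the raw features would penalize it. That is one way to read the abstract's point about avoiding exact matching. Because the relation matrix is tokens by tokens, the teacher and student may also use different hidden sizes.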

efficient pre-trained language model, feature correlation distillation, name change, (7 more...)

Neural Information Processing Systems

Genre: Research Report > Promising Solution (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Extracting Rule-based Descriptions of Attention Features in Transformers

Friedman, Dan, Bhaskar, Adithya, Wettig, Alexander, Chen, Danqi

arXiv.org Artificial Intelligence, Oct-22-2025

Mechanistic interpretability strives to explain model behavior in terms of bottom-up primitives. The leading paradigm is to express hidden states as a sparse linear combination of basis vectors, called features. However, this only identifies which text sequences (exemplars) activate which features; the actual interpretation of features requires subjective inspection of these exemplars. This paper advocates for a different solution: rule-based descriptions that match token patterns in the input and correspondingly increase or decrease the likelihood of specific output tokens. Specifically, we extract rule-based descriptions of SAE features trained on the outputs of attention layers. While prior work treats the attention layers as an opaque box, we describe how it may naturally be expressed in terms of interactions between input and output features, of which we study three types: (1) skip-gram rules of the form "[Canadian city]... speaks --> English", (2) absence rules of the form "[Montreal]... speaks -/-> English," and (3) counting rules that toggle only when the count of a word exceeds a certain value or the count of another word. Absence and counting rules are not readily discovered by inspection of exemplars, where manual and automatic descriptions often identify misleading or incomplete explanations. We then describe a simple approach to extract these types of rules automatically from a transformer, and apply it to GPT-2 small. We find that a majority of features may be described well with around 100 skip-gram rules, though absence rules are abundant even as early as the first layer (in over a fourth of features). We also isolate a few examples of counting rules. This paper lays the groundwork for future research into rule-based descriptions of features by defining them, showing how they may be extracted, and providing a preliminary taxonomy of some of the behaviors they represent.
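
The three rule types named in the abstract can be made concrete with a toy matcher. This is only an illustrative sketch, not the paper's extraction method: the city word list, the function names, and the reading of the absence rule as "an earlier Montreal suppresses the promotion" are all assumptions.

```python
def skip_gram_fires(tokens, context_words, trigger):
    """True if `trigger` occurs after some earlier token from `context_words`."""
    seen_context = False
    for tok in tokens:
        if tok == trigger and seen_context:
            return True
        if tok in context_words:
            seen_context = True
    return False

def counting_rule_fires(tokens, word, threshold):
    """True only when `word` occurs more than `threshold` times."""
    return tokens.count(word) > threshold

def promotes_english(tokens):
    """Combine a skip-gram rule with an absence rule (toy interpretation)."""
    # Hypothetical word list standing in for the "[Canadian city]" token class.
    canadian_cities = {"Toronto", "Vancouver", "Montreal"}
    fires = skip_gram_fires(tokens, canadian_cities, "speaks")
    # Assumed absence rule: an earlier "Montreal" blocks the promotion of "English".
    blocked = skip_gram_fires(tokens, {"Montreal"}, "speaks")
    return fires and not blocked

toronto = promotes_english("Someone from Toronto speaks".split())
montreal = promotes_english("Someone from Montreal speaks".split())
no_city = promotes_english("She speaks".split())
```

Under these assumptions the Toronto sentence promotes "English", the Montreal sentence does not, and a sentence with no city leaves the rule inactive.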

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.18148

Country: North America > Canada > Quebec > Montreal (0.24)

Genre: Research Report (0.64)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)


Quantifying the Accuracy-Interpretability Trade-Off in Concept-Based Sidechannel Models

Debot, David, Marra, Giuseppe

arXiv.org Artificial Intelligence, Oct-17-2025

Concept Bottleneck Models (CBNMs) are deep learning models that provide interpretability by enforcing a bottleneck layer where predictions are based exclusively on human-understandable concepts. However, this constraint also restricts information flow and often results in reduced predictive accuracy. Concept Sidechannel Models (CSMs) address this limitation by introducing a sidechannel that bypasses the bottleneck and carries additional task-relevant information. While this improves accuracy, it simultaneously compromises interpretability, as predictions may rely on uninterpretable representations transmitted through sidechannels. Currently, there exists no principled technique to control this fundamental trade-off. In this paper, we close this gap. First, we present a unified probabilistic concept sidechannel meta-model that subsumes existing CSMs as special cases. Building on this framework, we introduce the Sidechannel Independence Score (SIS), a metric that quantifies a CSM's reliance on its sidechannel by contrasting predictions made with and without sidechannel information. We propose SIS regularization, which explicitly penalizes sidechannel reliance to improve interpretability. Finally, we analyze how the expressivity of the predictor and the reliance on the sidechannel jointly shape interpretability, revealing inherent trade-offs across different CSM architectures. Empirical results show that state-of-the-art CSMs, when trained solely for accuracy, exhibit low representation interpretability, and that SIS regularization substantially improves their interpretability, intervenability, and the quality of learned interpretable task predictors. Our work provides both theoretical and practical tools for developing CSMs that balance accuracy and interpretability in a principled manner.
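
Contrasting predictions with and without the sidechannel can be sketched as a simple agreement rate. The abstract does not define SIS, so the function below is a hypothetical proxy rather than the paper's metric: it assumes hard label predictions and reports the fraction of examples whose label is unchanged once the sidechannel is removed.

```python
def sidechannel_agreement(preds_with, preds_without):
    """Fraction of examples whose predicted label is unchanged without the sidechannel.

    Hypothetical proxy for a sidechannel-independence score: 1.0 means the
    sidechannel never changes the prediction, 0.0 means it always does.
    """
    if not preds_with or len(preds_with) != len(preds_without):
        raise ValueError("prediction lists must be non-empty and of equal length")
    agree = sum(a == b for a, b in zip(preds_with, preds_without))
    return agree / len(preds_with)

# Toy labels: the third example flips when the sidechannel is removed.
score = sidechannel_agreement([0, 1, 1, 2], [0, 1, 2, 2])
```

A regularizer in the spirit of the abstract would push such a score upward during training. A differentiable version would need to compare predicted distributions rather than hard labels.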

artificial intelligence, machine learning, sidechannel, (18 more...)

arXiv.org Artificial Intelligence

2510.0567

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


EPIC: Generative AI Platform for Accelerating HPC Operational Data Analytics

Karimi, Ahmad Maroof, Shin, Woong, Hines, Jesse, Ghosal, Tirthankar, Sattar, Naw Safrin, Wang, Feiyi

arXiv.org Artificial Intelligence, Sep-23-2025

We present EPIC, an AI-driven platform designed to augment operational data analytics. EPIC employs a hierarchical multi-agent architecture where a top-level large language model provides query processing, reasoning and synthesis capabilities. These capabilities orchestrate three specialized low-level agents for information retrieval, descriptive analytics, and predictive analytics. This architecture enables EPIC to perform HPC operational analytics on multi-modal data, including text, images, and tabular formats, dynamically and iteratively. EPIC addresses the limitations of existing HPC operational analytics approaches, which rely on static methods that struggle to adapt to evolving analytics tasks and stakeholder demands. Through extensive evaluations on the Frontier HPC system, we demonstrate that EPIC effectively handles complex queries. Using descriptive analytics as a use case, fine-tuned smaller models outperform large state-of-the-art foundation models, achieving up to 26% higher accuracy. Additionally, we achieved 19x savings in LLM operational costs compared to proprietary solutions by employing a hybrid approach that combines large foundational models with fine-tuned local open-weight models.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.16212

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.51)


EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature

Neural Information Processing Systems, May-27-2025, 02:33:28 GMT

Spiking neural networks (SNNs) have attracted growing interest as an energy-efficient alternative to conventional artificial neural networks (ANNs). They process information by exchanging 0/1 spikes, so most of the multiplications in the network can be replaced by additions. However, binary spike feature maps limit the expressiveness of the SNN and result in unsatisfactory performance compared with ANNs. It has been shown that a rich output feature representation (i.e., the feature vector before the classifier) is beneficial to training an accurate classification model in ANNs. We ask whether the same holds for SNNs and how to improve the feature representation of the SNN. To this end, we realize this idea in two methods specially designed for SNNs. First, inspired by ANN-SNN methods showing that copying the weight parameters of a trained ANN, with light modification, into a homogeneous SNN can yield a well-performing SNN, we use the rich information in the weight parameters of the trained ANN counterpart to guide the feature representation learning of the SNN. In particular, we feed the SNN's and the ANN's feature representations of the same input to the ANN's classifier to produce the SNN's and the ANN's outputs, respectively, and then align the features with a KL-divergence loss, as in knowledge distillation methods; we call this the L_AF loss. It can be seen as a novel and effective knowledge distillation method specially designed for the SNN that draws on both knowledge distillation and ANN-SNN methods.
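
The alignment step described above (passing both feature vectors through the ANN's classifier and comparing the outputs with a KL divergence) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the classifier is taken to be a single linear layer, the KL direction is ANN-to-SNN, no temperature is used, and all names and values are hypothetical.

```python
import math

def linear(weights, bias, feature):
    """Logits of a linear classifier: weights @ feature + bias."""
    return [sum(w * x for w, x in zip(row, feature)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def alignment_loss(ann_weights, ann_bias, ann_feature, snn_feature):
    """KL between outputs of the ANN classifier on ANN and SNN features."""
    p_ann = softmax(linear(ann_weights, ann_bias, ann_feature))
    p_snn = softmax(linear(ann_weights, ann_bias, snn_feature))
    return kl_divergence(p_ann, p_snn)  # assumed direction: ANN as the target

# Toy 2-class classifier and features (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
same = alignment_loss(W, b, [2.0, 0.0], [2.0, 0.0])
different = alignment_loss(W, b, [2.0, 0.0], [0.0, 2.0])
```

Because both feature vectors pass through the same frozen classifier, the loss is zero when the SNN feature produces the same output distribution as the ANN feature and grows as the two diverge.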

artificial intelligence, feature representation, machine learning, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


How Control Information Influences Multilingual Text Image Generation and Editing?

Neural Information Processing Systems, May-26-2025, 16:03:41 GMT

artificial intelligence, control information, machine learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)


